Abstract

Background: Mycobacterium abscessus complex, the third most frequent mycobacterial complex responsible for
community- and health care-associated infections in developed countries, comprises M. abscessus subsp. abscessus
and M. abscessus subsp. bolletii, the latter previously referred to as Mycobacterium bolletii and Mycobacterium
massiliense. The diversity of this group of opportunistic pathogens is poorly described.

Results: In-depth analysis of 14 published M. abscessus complex genomes found a pan-genome of 6,153 proteins and a
core-genome of 3,947 (64.1%) proteins, indicating a non-conservative genome. Analysing the average percentage of
amino-acid sequence identity (from 94.19% to 98.58%) discriminates three main clusters C1, C2 and C3: C1 comprises
strains belonging to M. abscessus, C2 comprises strains belonging to M. massiliense and C3 comprises strains belonging
to M. bolletii; clusters C2 and C3 each comprise two sub-clusters. The phylogenomic network confirms these three
clusters. The genome length (from 4.8 to 5.51 Mb) differs between clusters, with a median of 5.07 Mb in C1, 4.89 Mb in
C2A, 5.01 Mb in C2B and 5.28 Mb in C3. The mean number of prophage regions (from 0 to 7) is 2 in C1, 1.33 in C2A,
3.5 in C2B and 5 in C3. A total of 36 genes are uniquely present in C1, 15 in C2 and 15 in C3. These genes could be
used for the detection and identification of organisms in each cluster. Further, the mean number of host-interaction
factors (including PE, PPE, LpqH, MCE, Yrbe and type VII secretion system ESX3 and ESX4) is 70 in cluster C1, 80 in
cluster C2A, 74 in cluster C2B and 93 in clusters C3A and C3B. No significant differences in antibiotic resistance
genes were observed between clusters, in contrast to previously reported in-vitro patterns of drug resistance. All
clusters encode both penicillin-binding proteins targeted by β-lactam antibiotics and an Ambler class A β-lactamase
for which inhibitors exist.

Conclusions: Our comparative analysis indicates that M. abscessus complex comprises three genomospecies,
corresponding to M. abscessus, M. bolletii, and M. massiliense. The genomic data reported here indicate differences in
virulence of medical interest and suggest targets for the refined detection and identification of M. abscessus.

Keywords: Mycobacterium abscessus, Mycobacterium bolletii, Mycobacterium massiliense, Araucaria, Mycobacteriophage, 
Prophage 



Background 

The non-tuberculous mycobacterium Mycobacterium abscessus was
long confused with Mycobacterium chelonae [1]. Other closely
related species include Mycobacterium salmoniphilum [2],
Mycobacterium immunogenum [3], Mycobacterium massiliense [4],
Mycobacterium bolletii [5] and Mycobacterium franklinii [6],
altogether forming the Mycobacterium chelonae-abscessus complex.
This complex is the third most frequent mycobacterial complex
infecting humans in developed countries, after the
Mycobacterium tuberculosis and Mycobacterium avium
complexes [7,8]. A bibliometric search retrieving over 1,700
publications in the Medline database illustrates that this
complex is emerging, causing both sporadic cases and outbreaks
of community-acquired and health care-associated infections [9].
Not only humans but also cats [10,11] and dolphins [12-14] are
infected, while fish are uniquely infected by
M. salmoniphilum [2,15].

Current nomenclature is that the species M. abscessus
comprises two subspecies named M. abscessus subsp.
abscessus and M. abscessus subsp. bolletii. The latter taxon
accommodates isolates previously referred to as M. bolletii or
M. massiliense [16]. This nomenclature, however, may
obscure the true diversity of mycobacteria in this complex.
While the 16S rRNA gene yields an identical sequence for
M. abscessus and M. bolletii, this sequence shares 99% identity
with that of M. massiliense. rpoB gene sequencing formed the
basis for the description of recent species [17-19] but yielded
further conflicting results [20-22]. Multilocus sequence
analysis [23] and multispacer sequence typing [24]
differentiated M. massiliense from M. bolletii. In this report,
the previous nomenclature, with M. abscessus, M. bolletii and
M. massiliense forming the M. abscessus complex, has been
retained for clarity.

The availability of 39 M. abscessus, 13 M. massiliense
and two M. bolletii genomes in the National Center for
Biotechnology Information (NCBI) genome database provides new
opportunities to assess the diversity of this species. Here, we
review 14 complete published M. abscessus complex genomes
and compare them with the re-annotated M. tuberculosis
H37Rv genome (Table 1) in order to analyse in depth the
diversity of M. abscessus.

Results and discussion 

M. abscessus complex pan- and core-genome 

M. abscessus complex genomes comprise one circular
chromosome. In addition, M. abscessus ATCC 19977 contains
one 23-kb plasmid identical to the Mycobacterium marinum
pMM23 plasmid, encoding the mer operon and a mercury reductase
protein, which may confer resistance to organo-mercury
compounds [25]. In order to normalize the predicted proteins
and to minimize differences in gene presence/absence and
length, coding sequences were predicted using Prodigal
software [26]. We identified a total of 70,309 protein-coding
sequences, the number of which varies from 4,651 to 5,079 per
genome (Table 2). Using orthoMCL [28] with a conservative
parameter value of 50% sequence identity, we categorized these
70,309 proteins into 6,153 orthologous protein groups, including
3,947 core-genome groups and 55 strain-specific groups. The
core-genome groups contain 57,172 protein sequences and account
for 64.15% of the pan-genome groups. This figure indicates a
non-conservative genome, contrary to that of Mycobacterium
tuberculosis, a conservative-genome pathogen whose core-genome
accounts for 96.1% of the pan-genome [27].
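
The distinction between pan-genome, core-genome and strain-specific groups can be illustrated with a minimal Python sketch. It assumes that the orthologous groups are available as a mapping from a group identifier to the set of genomes contributing at least one member; the toy genomes, the group names and the helper `classify_groups` are hypothetical and do not reproduce the orthoMCL output format.

```python
def classify_groups(groups, genomes):
    """Split orthologous groups into core, strain-specific and accessory sets.

    groups  -- dict: group id -> set of genome names with at least one member
    genomes -- iterable of all genome names in the comparison
    """
    all_genomes = set(genomes)
    core, strain_specific, accessory = set(), set(), set()
    for group_id, members in groups.items():
        if members == all_genomes:
            core.add(group_id)             # present in every genome
        elif len(members) == 1:
            strain_specific.add(group_id)  # present in a single genome
        else:
            accessory.add(group_id)        # present in some genomes only
    return core, strain_specific, accessory


# Toy input (hypothetical): three genomes, four orthologous groups.
toy_genomes = ["A", "B", "C"]
toy_groups = {
    "g1": {"A", "B", "C"},
    "g2": {"A", "B", "C"},
    "g3": {"A", "B"},
    "g4": {"C"},
}
core, strain_specific, accessory = classify_groups(toy_groups, toy_genomes)
core_fraction = len(core) / len(toy_groups)

# Figures reported in the text: 3,947 core groups out of 6,153 groups.
reported_core_percent = round(100 * 3947 / 6153, 2)
```

The last line only rechecks the arithmetic behind the reported core-genome share, using the two group counts given in the text.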

M. abscessus complex diversity 

The average percentage of amino-acid sequence identity
(AAI) of core proteins was determined as previously
described [29]. The AAI values indicate that M. abscessus
complex forms three main clusters: cluster 1 (C1) includes
M. abscessus type strain and strains M93, M94, M152 and
Go06; cluster 2 (C2) contains two subclusters: cluster 2A



Table 1 List of Mycobacterium abscessus genomes studied here

| Clusters | Organism | Isolated from | Geography | BioProject |
|---|---|---|---|---|
| C1 | M. abscessus type strain | human knee infection | United States | PRJNA61613, PRJNA15691 |
| C1 | M. abscessus M93 | sputum sample from a Malaysian patient presenting with a prolonged productive cough suggestive of a bacterial lower respiratory tract infection | Malaysia | PRJNA180393, PRJNA84203 |
| C1 | M. abscessus M94 | sputum sample of a Malaysian patient with a persistent cough and fever and consolidation in the chest radiograph | Malaysia | PRJNA180394, PRJNA88149 |
| C1 | M. abscessus M152 | acid-fast bacillus positive sputum of a Malaysian man | Malaysia | PRJNA159789 |
| C1 | M. massiliense strain GO 06 | patient who had undergone knee joint surgery | Brazil | PRJNA170732, PRJNA168263 |
| C2A | M. massiliense type strain | sputum specimen from hemoptoic pneumonia | Marseille | PRJNA180742, PRJNA65215 |
| C2A | M. massiliense M18 | lymph node biopsy specimen from a Malaysian patient suspected of having tuberculous cervical lymphadenitis | Malaysia | PRJNA89593 |
| C2A | M. massiliense M154 | bronchoalveolar lavage fluid of a Malaysian patient presenting with lower respiratory tract infection | Malaysia | PRJNA89603 |
| C2B | M. abscessus 47J26 | sputum sample from a patient with cystic fibrosis | England | PRJNA179981, PRJNA73255 |
| C2B | M. abscessus M115 | sputum from a Malaysian patient presenting with persistent cough and loss of body weight suggestive of pulmonary tuberculosis | Malaysia | PRJNA89601 |
| C2B | M. abscessus M139 | sputum sample of a 26-year-old Nepalese male presenting with hemoptysis | Nepal | PRJNA159701 |
| C2B | M. abscessus M172 | sputum isolate from a Malaysian patient | Malaysia | PRJNA89599 |
| C3A | M. bolletii type strain | respiratory tract specimen collected in a woman with hemoptoic pneumonia | Marseille | PRJNA180015, PRJNA73695 |
| C3B | M. abscessus M24 | bronchoalveolar lavage fluid of a Malaysian patient | Malaysia | PRJNA89595 |
Table 2 Mycobacterium abscessus core genome and unique genes

| Clusters | Organism | CDS | Unique core genome | Total genes |
|---|---|---|---|---|
| | M. abscessus | - | 36 | 180 |
| C1 | M. abscessus T | 4954 | - | - |
| | strain GO 06 | 4944 | - | - |
| | M93 | 4733 | 11 | 11 |
| | M94 | 4841 | 10 | 10 |
| | M152 | 4762 | - | - |
| | M. massiliense | - | 15 | 107 |
| C2A | M. massiliense T | 4962 | 3 | 3 |
| | M18 | 4663 | 8 | 8 |
| | M154 | 4651 | - | - |
| C2B | 47J26 | 4766 | | |
| | M115 | 4802 | 4 | 4 |
| | M139 | 4754 | 4 | 4 |
| | M172 | 5079 | 20 | 20 |
| | M. bolletii | | 15 | 30 |
| C3A | M. bolletii T | 4733 | 9 | 9 |
| C3B | M24 | 4960 | 23 | 23 |
| | M. abscessus core genome | | 3,947 | 57,172 |


(C2A) includes M. massiliense type strain and strains
M154 and M18; cluster 2B (C2B) includes strains 47J26,
M115, M172 and M139; cluster 3 (C3) includes two
subclusters: cluster 3A (C3A) includes M. bolletii type strain
and cluster 3B (C3B) includes M. bolletii strain M24
(Table 3).



M. abscessus complex proteomes were further aligned
using Mauve software [30] to infer phylogeny using the
Neighbor-Net algorithm in the package SplitsTree4
[31]. The phylogenomic network confirms the three clusters
C1, C2 and C3 (Figure 1A). A phylogenomic tree
based on gene content (i.e., the presence or absence of
orthologs) (Figure 1B) organizes M. abscessus differently
from the whole-genome concatenated tree (Figure 1A) or
even the phylogenetic tree based on the rpoB gene sequence
(Figure 1C). Phylogenomic analysis indicates that the
M. abscessus gene repertoires have different evolutionary
histories and suggests that differential gene loss and lateral
gene acquisition are playing important roles in the evolution
of some M. abscessus strains. Notably, the position of
strain Go06 is ambiguous, as it presents 98.4% AAI with
M. abscessus type strain in C1 (Figure 1A), whereas its rpoB
gene sequence and single nucleotide polymorphism analysis
are indicative of M. massiliense [8,32]. Our analyses
indicate that strain Go06 has an ambiguous classification
as a chimera between M. abscessus and M. massiliense
and is the only example compatible with a lateral transfer
of the rpoB gene.

Functional analysis using the Clusters of Orthologous Groups
(COG) database [33] and BLASTP indicates unique genes
in C1, encoding hypothetical proteins and proteins implicated
in transcription, energy production and transport,
carbohydrate metabolism and transport, lipid metabolism,
nucleotide metabolism and transport, amino-acid metabolism and
transport, post-translational modification and inorganic ion
transport and metabolism (Table 2, Figure 2). Within C2,
unique genes encode hypothetical proteins and proteins
implicated in transcription and lipid metabolism. Within C3,



Table 3 Average nucleotide identity and characteristics of Mycobacterium abscessus genomes

| Clusters | Strains | Genome length (Mb) | Genome GC% | AAI vs M. abscessus T | AAI vs M. bolletii T | AAI vs M. massiliense T |
|---|---|---|---|---|---|---|
| M. abscessus C1 | M. abscessus T | 5.09 | 62.7 | 100.00 | 95.56 | 94.74 |
| | M93 | 5.08 | 64.2 | 97.30 | 95.35 | 94.76 |
| | M94 | 5.1 | 64.2 | 97.56 | 95.67 | 94.79 |
| | M152 | 4.91 | 64.1 | 98.59 | 96.33 | 95.73 |
| | strain GO 06 | 5.07 | 64.2 | 98.35 | 95.23 | 95.64 |
| M. massiliense C2A | M. massiliense T | 5.2 | 64.2 | 95.56 | 96.13 | 100.00 |
| | M18 | 4.89 | 64.2 | 96.66 | 96.09 | 97.57 |
| | M154 | 4.8 | 64.1 | 96.14 | 95.81 | 97.26 |
| C2B | M115 | 4.98 | 64.1 | 96.16 | 95.36 | 96.92 |
| | M172 | 5.2 | 64.2 | 95.30 | 94.93 | 96.17 |
| | 47J26 | 4.87 | 64.1 | 96.23 | 95.74 | 96.93 |
| | M139 | 5.05 | 64.1 | 95.94 | 95.64 | 96.88 |
| M. bolletii C3A | M. bolletii T | 5.05 | 64.2 | 94.51 | 100.00 | 95.33 |
| C3B | M24 | 5.51 | 64.2 | 94.91 | 96.47 | 94.20 |

Figure 1 Phylogenomic analysis of M. abscessus. A. Phylogenetic network based on aligned whole genomes. B. Gene content phylogeny constructed
from the matrix of discrete characters using the neighbor-joining method. C. rpoB gene-based phylogenetic tree using the neighbor-joining method.



unique genes encode hypothetical proteins and proteins
implicated in amino-acid metabolism and transport and in
translation. These unique genes could form the basis for a
refined identification of the three genomospecies. However, we
cannot exclude that these unique genes result from a coding
sequence that arose de novo, from horizontal gene transfer
(HGT) or from gene loss in the other subspecies. The absence of
a gene could likewise reflect either a real loss or an assembly
artefact.

C1 strains have been isolated from American and
Malaysian patients suffering from knee infection and lower


Figure 2 Core genomes in M. abscessus clusters.



respiratory infection, respectively (Table 1). C2A strains
were isolated from Malaysian and French patients suffering
from severe respiratory tract infections. C2B strains were
isolated from Nepalese, Malaysian and English patients
suffering from respiratory tract infections, including cystic
fibrosis and pulmonary tuberculosis patients. C3A and C3B
strains were exclusively isolated from patients suffering from
respiratory tract infections, in France and Malaysia,
respectively. Therefore, clusters specify the clinical form and
geographical origin of the infection.

Altogether, genomic analyses revealed a more heterogeneous
structure of M. abscessus complex than the
one currently suggested by the nomenclature, which
recognizes only two subspecies within M. abscessus
[16]. It has been proposed that two genomes exhibiting
AAI >96% belong to the same species [34,35]. Therefore,
AAI analysis indicates that M. abscessus in fact
comprises three genomospecies, corresponding to the
previous nomenclature of M. abscessus (C1), M. massiliense
(C2) and M. bolletii (C3). Using a 97% AAI
threshold would further distinguish two subspecies in
M. massiliense (C2A and C2B) and in M. bolletii (C3A
and C3B). Recent whole genome sequencing analyses
of clinical isolates in Great Britain also clearly
distinguished three clusters, in agreement with the three
reported here [8]. All these data support re-evaluating the
taxonomy of M. abscessus complex to recognize three
genomospecies, M. abscessus (C1), M. massiliense (C2) and
M. bolletii (C3), and four unnamed subspecies,
C2A, C2B, C3A and C3B.
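
The clustering logic discussed above can be sketched in Python using the AAI values against the three type strains listed in Table 3. This is only an illustration: assigning each strain to the type strain with the highest AAI, and using 97% AAI to that type strain as the sub-cluster cut-off, are simplifying assumptions made for this example and are not the procedure described in the Methods. The function `assign_genomospecies` is hypothetical.

```python
# Column order of the AAI values below, as in Table 3.
TYPE_STRAINS = ("M. abscessus", "M. bolletii", "M. massiliense")

# AAI (%) of each non-type strain versus the three type strains (Table 3).
AAI_VS_TYPE = {
    "M93":   (97.30, 95.35, 94.76),
    "M94":   (97.56, 95.67, 94.79),
    "M152":  (98.59, 96.33, 95.73),
    "GO 06": (98.35, 95.23, 95.64),
    "M18":   (96.66, 96.09, 97.57),
    "M154":  (96.14, 95.81, 97.26),
    "M115":  (96.16, 95.36, 96.92),
    "M172":  (95.30, 94.93, 96.17),
    "47J26": (96.23, 95.74, 96.93),
    "M139":  (95.94, 95.64, 96.88),
    "M24":   (94.91, 96.47, 94.20),
}


def assign_genomospecies(aai_values, sub_threshold=97.0):
    """Return (closest type strain, True if AAI to it reaches sub_threshold)."""
    best_index = max(range(len(aai_values)), key=lambda i: aai_values[i])
    species = TYPE_STRAINS[best_index]
    return species, aai_values[best_index] >= sub_threshold


assignments = {
    strain: assign_genomospecies(values)
    for strain, values in AAI_VS_TYPE.items()
}
```

With the Table 3 values, strains M18 and M154 stay at or above 97% AAI to the M. massiliense type strain (C2A), whereas M115, M172, 47J26 and M139 fall below it (C2B), and M24 falls below 97% AAI to the M. bolletii type strain (C3B).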

M. abscessus prophagome

M. abscessus median GC content is 64.2%, ranging
from 62.7% (M. abscessus ATCC 19977) to 64.2% (strain
Go06). The GC content is not characteristic of the clusters, as
the median GC content of C1, C2A and C3 is 64.2%,
close to the median GC content of 64.1% in C2B.

However, there is a significant 14.7% variation in genome
length, from 4.8 Mb (M. abscessus M154) to 5.51 Mb
(M. abscessus M24), with a median of 5.07 Mb. The median
genome size is 5.07 Mb in C1, 4.89 Mb in C2A, 5.01 Mb
in C2B and 5.28 Mb in C3. Differences in genome
size correlate with the number of prophage regions, which
are detected in 13/14 M. abscessus genomes (Figure 3):
M. abscessus M154 (M. massiliense C2A) has the smallest
genome, encoding no prophage, whereas M. bolletii M24 (C3)
has the largest genome, encoding seven prophage regions
(Figure 3). Prophage regions comprise up to 5% of the
genome length in M. abscessus M172. The number of
prophage regions in other genomes is diverse, ranging from
one to six regions encoding putative genes in the
subsystem of phages, prophages, transposable elements and
plasmids, which might contribute to species diversity [36]. The
mean number of prophage regions is 2 in C1, 1.33 in C2A,
3.5 in C2B and 5 in C3. This observation confirms the
particularity of C3: M. bolletii CIP108541T contains a 13-kb
and a 63-kb prophage, whereas M. bolletii strain M24
contains seven prophage regions, including one 17-kb region
homologous to the M. bolletii CIP108541T 13-kb region
and a 27-kb region homologous to the M. massiliense
CCUG 48898 50-kb region [37,38] (Table 4). M. abscessus
genomes encode putative phage-related genes necessary
for phage replication as well as phage-tail protein, phage
endolysin, capsid proteins (major protein and scaffold
proteins) and phage tape measure protein. Both ends of the
prophage region encode putative phage integrases. M. abscessus
genomes encode small prophage-like regions. However,
only M. bolletii has been reported to produce a
mycobacteriophage, which we named Araucaria after we recently
resolved its 3D structure by electron microscopy [39]. The
M. abscessus M94 genome harbours one particular
pseudo-tRNA spanning the region 51,150-57,394 in contig 33,
which is not observed in the other M. abscessus genomes
[40]. Phages have been reported to increase the virulence of
their host and to encode antimicrobial resistance genes [41].
In M. abscessus, however, no such genes were identified,
but phages could be targeted for the differentiation
between the three M. abscessus genomospecies.
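
The relationship between genome size and prophage content can be rechecked with a short Python sketch. It uses the genome lengths from Table 3 and the number of prophage regions listed per strain in Table 4, grouped as in the text (C3A and C3B pooled as C3). The Pearson coefficient is used here as an assumed measure of correlation, since the text does not state which statistic was applied; the helper names are hypothetical.

```python
from math import sqrt

# strain -> (cluster, genome length in Mb from Table 3,
#            number of prophage regions listed in Table 4)
STRAINS = {
    "M. abscessus T":   ("C1", 5.09, 1),
    "M93":              ("C1", 5.08, 4),
    "M94":              ("C1", 5.10, 2),
    "M152":             ("C1", 4.91, 2),
    "GO 06":            ("C1", 5.07, 1),
    "M. massiliense T": ("C2A", 5.20, 3),
    "M18":              ("C2A", 4.89, 1),
    "M154":             ("C2A", 4.80, 0),
    "M115":             ("C2B", 4.98, 3),
    "M172":             ("C2B", 5.20, 6),
    "47J26":            ("C2B", 4.87, 3),
    "M139":             ("C2B", 5.05, 2),
    "M. bolletii T":    ("C3", 5.05, 3),
    "M24":              ("C3", 5.51, 7),
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_prophages(cluster):
    """Mean number of prophage regions over the strains of one cluster."""
    return mean(n for c, _, n in STRAINS.values() if c == cluster)


sizes = [size for _, size, _ in STRAINS.values()]
counts = [n for _, _, n in STRAINS.values()]
r = pearson(sizes, counts)
```

Calling `mean_prophages` for C1, C2A, C2B and C3 recovers the per-cluster means quoted in the text, and `r` is positive for these 14 genomes.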

M. abscessus complex resistome 

Like all mycobacteria, M. abscessus complex organisms are
enclosed in a hydrophobic cell wall that forms a barrier to
hydrophilic antibiotics. Accordingly, M. abscessus organisms
are multidrug resistant and exhibit differing drug resistance
patterns [42-44]. M. abscessus genomes encode many proteins
potentially involved in drug-efflux systems, including members
of the major facilitator family, ABC transporters and MmpL
proteins; the Small Multidrug Resistance family, a family of
lipophilic drug efflux proteins [45]; and a multidrug resistance
stp protein similar to that of M. tuberculosis involved in
spectinomycin and tetracycline resistance [46]. M. abscessus,
M. bolletii and M. massiliense were reported to be susceptible
to amikacin in vitro; however, comparison with the M.

Figure 3 Correlation between Mycobacterium abscessus genome size (y axis) and the number of prophages (x axis).

Table 4 Mycobacterium abscessus prophages

| Clusters | Strain | Region | Region length | CDS | Region position |
|---|---|---|---|---|---|
| M. abscessus C1 | M. abscessus T | 1 | 81 Kb | 110 | 1754551-1835095 |
| | M93 | 1 | 16.4 Kb | 33 | 197463-213867 |
| | | 2 | 38 Kb | 51 | 232006-270072 |
| | | 3 | 53 Kb | 70 | 1762720-1815780 |
| | | 4 | 20.2 Kb | 26 | 1820768-1841058 |
| | M94 | 1 | 58.3 Kb | 84 | 1039523-1097850 |
| | | 2 | 79.4 Kb | 99 | 4959719-5039151 |
| | M152 | 1 | 48.9 Kb | 53 | 1897722-1946683 |
| | | 2 | 34.9 Kb | 44 | 4784847-4819818 |
| | Go06 | 1 | 65 Kb | 44 | 1768158-1833157 |
| M. massiliense C2A | M. massiliense T | 1 | 12.5 Kb | 21 | 1600973-1613514 |
| | | 2 | 31.3 Kb | 33 | 1620002-1651385 |
| | | 3 | 50.4 Kb | 69 | 3907205-3957680 |
| | M18 | 1 | 62.8 Kb | 67 | 4702725-4765592 |
| | M154 | 0 | 0 | 0 | 0 |
| C2B | M115 | 1 | 11.6 Kb | 10 | 1416841-1428481 |
| | | 2 | 77.1 Kb | 102 | 1624644-1701770 |
| | | 3 | 55.3 Kb | 79 | 3356346-3411651 |
| | M172 | 1 | 55.1 Kb | 74 | 502478-557677 |
| | | 2 | 50.7 Kb | 50 | 546109-596832 |
| | | 3 | 59 Kb | 67 | 1934186-1993225 |
| | | 4 | 31.1 Kb | 33 | 2050376-2081567 |
| | | 5 | 39.4 Kb | 45 | 3711805-3751246 |
| | | 6 | 19.6 Kb | 40 | 3753466-3773078 |
| | 47J26 | 1 | 39.9 Kb | 48 | 1066714-1106668 |
| | | 2 | 12.4 Kb | 16 | 3596408-3608873 |
| | | 3 | 41.4 Kb | 42 | 3823414-3864899 |
| | M139 | 1 | 35.9 Kb | 43 | 2906235-2942215 |
| | | 2 | 12.5 Kb | 17 | 5033731-5046263 |
| M. bolletii C3A | M. bolletii T | 1 | 41.6 Kb | 47 | 1684736-1726377 |
| | | 2 | 20.9 Kb | 38 | 1727918-1748849 |
| | | 3 | [illegible] | [illegible] | [illegible] |
| C3B | M24 | 1 | 37.1 Kb | 51 | 560940-598047 |
| | | 2 | 37 Kb | 37 | 1680197-1717263 |
| | | 3 | 17 Kb | 21 | 3830340-3847343 |
| | | 4 | 18.1 Kb | 34 | 5051771-5069955 |
| | | 5 | 26 Kb | 35 | 5155113-5181190 |
| | | 6 | 19.2 Kb | 26 | 5213195-5232444 |
| | | 7 | 26.5 Kb | 33 | 5312024-5338593 |



tuberculosis H37Rv resistome and the antibiotic resistance
databases indicates that M. abscessus encodes an
aminoglycoside 2'-N-acetyltransferase and aminoglycoside
phosphotransferases involved in resistance to aminoglycosides.
Also, genetic analyses disclosed 16S rRNA gene mutations
conferring aminoglycoside resistance [4,5,47]. Indeed,
the presence of a single rRNA operon in all of the
M. abscessus genomes favours the occurrence of dominant
mutations conferring resistance to aminoglycosides and
macrolides. M. abscessus genomes encode a rifampin
ADP-ribosyl transferase and monooxygenases potentially
involved in resistance to rifampin and tetracyclines.
Moreover, M. abscessus genomes encode three tet(M) genes
conferring resistance to tetracycline and doxycycline; the
number of tet(M) genes was correlated with resistance to
cyclines in Escherichia coli [48]. However, M. massiliense
was reported to be susceptible, and M. abscessus and
M. bolletii to be resistant, to doxycycline [49]. M. abscessus
genomes encode resistance to fusidic acid, glycopeptides,
MLS (macrolide-lincosamide-streptogramin B), phenicols,
rifampicin, sulphonamide and trimethoprim. Also,
M. abscessus genomes encode FolP homologs conferring
resistance to cotrimoxazole, a homolog of
UDP-N-acetylglucosamine 1-carboxyvinyltransferase (a MurA
protein conferring resistance to fosfomycin) and homologs of
23S rRNA methylases conferring resistance to macrolides. Also,
the M. abscessus genome encodes an erm(41) gene, mutations of
which were reported to confer clarithromycin resistance [50].
In-vitro tests showed that M. massiliense clinical isolates
could be distinguished from M. abscessus isolates by their
susceptibility to ciprofloxacin [51], whereas M. bolletii
isolates were reported to be resistant to all quinolones [52].
A mutation at codon 90 in the gyrA gene was reported in
clinical isolates of M. abscessus exhibiting high resistance to
ciprofloxacin [51]. This observation contrasts with our genome
analysis, which found no such mutations, suggesting that
other mechanisms of resistance may be involved in
high-level resistance to quinolones [52]. Accordingly, we found
that M. abscessus mycobacteria encode qepA2, a plasmidic
gene conferring quinolone resistance in gram-negative
bacteria [53]. M. abscessus mycobacteria were reported to be
resistant in vitro to penicillin, amoxicillin, cefoxitin,
ceftriaxone, cefotaxime and imipenem [4,5]. This contrasts with
the fact that they encode penicillin-binding proteins (PBPs),
which are essential for peptidoglycan synthesis and are the
targets of β-lactam antibiotics (except for
tabtoxinine-β-lactam, which inhibits glutamine synthetase)
[54,55]. M. abscessus genomes encode an Ambler class A
β-lactamase homologous to β-lactamases in gram-negative
bacteria and to two β-lactamases in M. tuberculosis.
β-Lactamase inhibitors have not been evaluated against
M. abscessus sensu lato mycobacteria.

Genome-based analysis of host-interactions 

M. abscessus organisms are ubiquitous environmental organisms
in soil and water [9], where they may have to cope with
amoebae: M. chelonae, M. abscessus, M. massiliense and
M. immunogenum were reported to survive within
Acanthamoeba polyphaga trophozoites and cysts [5].
Accordingly, our analyses indicate that M. abscessus
genomes encode factors implicated in host interactions.
The mean number of genes encoding proline-glutamate
(PE), proline-proline-glutamate (PPE), 10-kDa lipoprotein
antigen precursor (LpqH), Mammalian Cell Entry
(MCE), oxidoreductase (Yrbe) and type VII secretion
system proteins is 70 in C1, 80 in C2A, 74 in C2B and 93 in
C3. In M. abscessus, rough colonies lack mmpL4 (a gene
required for glycopeptidolipid biosynthesis) and have lost
surface colonization, replication in human macrophages
and stimulation of the innate immune response; these
observations suggested that glycopeptidolipid is a virulence
factor [56-58]. Accordingly, glycopeptidolipids are
required for sliding motility [59] and biofilm formation [60].
Glycopeptidolipids have also been suspected to inhibit
phagocytosis of M. avium subsp. avium [61]. M. abscessus
genomes encode MCE proteins similar to those of M. tuberculosis
H37Rv. The MCE operon promotes internalization of
M. tuberculosis by mammalian cells [62] and initiates rapid
induction of transcription of genes involved in substrate
trafficking [63]. The number of mce operons, which
correlates with pathogenicity [64], varies from six in C2B to 13
in C3. In parallel, M. abscessus genomes encode 12 (C1)
to 21 (C3A, C3B) copies of Yrbe proteins. As for secretion
systems, recent evidence showed that mycobacteria
evolved specialized type VII secretion systems to transport
extracellular proteins across the cell wall [65]. Type VII
secretion systems ESX-1 and ESX-5 are involved in
cell-to-cell migration of M. tuberculosis [65,66]. In M. abscessus,
our analyses indicate that the ESX-3 and ESX-4 systems are
conserved (Figure 4). However, M. abscessus M139 (C2B)
lacks two proteins of the ESX-3 system and M. abscessus
M93 (C1) lacks the ESAT-6-like and CFP-10-like proteins
secreted by the ESX-4 system. Interestingly, M. abscessus
M18 (C2A) encodes ESAT-6 and CFP-10 proteins secreted
by the ESX-1 system. In addition, there are two or three PE
and six (M. massiliense, M. abscessus M115 or M. abscessus
47J26) to 12 (M. bolletii M24) PPE proteins, which
are reported to be involved in the virulence of
M. tuberculosis [67]. Our analyses further indicated that
proteins related to phenazine biosynthesis, homogentisate
catabolism, phenylacetic acid degradation and DNA degradation
might have been transferred from Actinobacteria (e.g.
Rhodococcus spp., Streptomyces spp.) and pseudomonads
(Pseudomonas aeruginosa and Burkholderia cepacia). Although
distantly related, these bacteria share the same ecosystem
as M. abscessus within the cystic fibrosis microbiota.

Conclusions 

Our in-depth genomic analyses indicate that M. abscessus
has a non-conservative genome, suggesting the possibility
of on-going transfer of additional genetic material.
Unsurprisingly, M. abscessus has already acquired antibiotic
resistance. Also, phages have mediated diversity and
horizontal gene transfer, which have driven the rapid evolution
of this complex. Indeed, gene transfers have driven the
evolution of M. abscessus towards three different
genomospecies, M. abscessus, M. massiliense and M. bolletii,
and the evolution of four different yet unnamed subspecies. Each

Figure 4 Heatmap clustering of the Mycobacterium abscessus type VII secretion system compared to Mycobacterium tuberculosis H37Rv.
M. abscessus strains are listed on the left side of the map.



genomospecies has its own specificities in terms of genome
size, prophagome and genome content. We identified
a total of 66 genes uniquely present in one genomospecies
(36 in C1, 15 in C2 and 15 in C3); these genes could be used
for the refined detection and identification of M. abscessus
organisms. These genomic differences support differences in
host interactions and in the clinical presentation of
infection, with M. massiliense (C2A and C2B) being more
virulent than the two other genomospecies. Host-interaction
factors may contribute to the ability of M. abscessus to
colonize mammalian hosts, where its respiratory tract habitat
puts it in close proximity to other serious opportunistic
pathogens that can act as donors of additional
host-interaction factors.

The information reported here regarding differences between
M. abscessus genomospecies will help in understanding
their pathogenesis factors and could reveal new, more
specific targets for drug design and diagnostic tools.

Methods 

Genome dataset 

The whole genomes of 14 M. abscessus strains were
downloaded from GenBank (Table 1). For each genome, the
genomic sequences, whether contigs or finished genomes, were
concatenated into one pseudogenome.
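
The concatenation step can be sketched in Python as follows. This is a minimal illustration, assuming that contigs are joined in the order in which they appear in the FASTA file and that no spacer is inserted by default; neither choice is stated in the text. The helpers `read_fasta` and `make_pseudogenome` are hypothetical names.

```python
def read_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def make_pseudogenome(records, spacer=""):
    """Join contig sequences, in input order, into one sequence.

    spacer -- optional string placed between contigs (assumption: empty)
    """
    return spacer.join(sequence for _, sequence in records)


# Toy example with two short contigs (hypothetical data).
toy_fasta = ">contig1\nACGT\nAC\n>contig2\nGG\n"
toy_records = read_fasta(toy_fasta)
toy_pseudogenome = make_pseudogenome(toy_records)
```

A spacer such as a run of `N` characters can be passed through the `spacer` argument when contig boundaries should remain visible in the concatenated sequence.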

Prophage detection and genome annotation 

Protein sequences were predicted using Prodigal software
[26] to generate normalized files containing the combined
protein sequences of all 14 genomes. Prophage regions
were detected using PHAST software (Table 4). Predicted
proteins were annotated using BLASTp against the National
Center for Biotechnology Information (NCBI)
non-redundant (NR) database, UniProt (http://www.uniprot.org/),
the Clusters of Orthologous Groups (COG) database [33]
and an in-house antibiotic resistance gene database.

Genome clustering and calculation of core genomes 

Proteome sequences were compared using BLASTP and
pair-wise alignments were made using ClustalW; the AAI was
determined as the mean percentage of amino-acid sequence
identity of core proteins [29]. We clustered the M. abscessus
homologous genes using orthoMCL [28] on the translated
protein sequences of all predicted genes with a
conservative parameter value of 50% sequence identity.
The determination of the different unique core genomes
was based on the homology clusters found by orthoMCL.
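
The identity measure described above, a mean percentage of sequence identity over core proteins, can be sketched in Python. The sketch assumes that each core protein pair has already been aligned (gaps written as `-`) and that alignment columns containing a gap are excluded from the identity calculation; the text does not specify how gaps were handled, and both function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def average_identity(aligned_pairs):
    """Mean percent identity over a list of aligned core-protein pairs."""
    identities = [percent_identity(a, b) for a, b in aligned_pairs]
    return sum(identities) / len(identities)


# Toy aligned pairs (hypothetical sequences).
toy_pairs = [
    ("MKVL", "MKVL"),    # identical
    ("MKV-L", "MKIAL"),  # one mismatch, one gapped column
]
toy_average = average_identity(toy_pairs)
```

Averaging per-protein identities gives every core protein the same weight regardless of its length; weighting by alignment length is an alternative that the text does not rule out.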

Phylogenetic analysis 

M. abscessus proteomes were aligned using Mauve software
[30] to infer phylogeny using the Neighbor-Net
algorithm in the package SplitsTree4 [31]. The orthologous
group data found by orthoMCL were used to construct a
whole-genome phylogenetic tree based on gene content.
We generated a matrix of binary discrete characters ("0"
and "1" for absence and presence, respectively) [68]. Using
this matrix, we constructed a phylogenetic tree by
implementing the neighbor-joining (NJ) method within
SplitsTree4 [31].
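
The gene-content matrix described above can be sketched in Python. The sketch builds the binary presence/absence rows from a mapping of orthologous groups to genomes and derives pairwise distances that a neighbor-joining implementation could take as input. The normalized Hamming distance used here is an assumption for illustration only, as the text does not state which distance was computed from the matrix; all names and the toy data are hypothetical.

```python
def presence_absence_matrix(groups, genomes):
    """Return genome -> list of 0/1 characters, one per orthologous group.

    groups  -- dict: group id -> set of genome names with at least one member
    genomes -- list of genome names
    """
    group_ids = sorted(groups)
    return {
        genome: [1 if genome in groups[gid] else 0 for gid in group_ids]
        for genome in genomes
    }


def hamming_distance(row_a, row_b):
    """Fraction of characters that differ between two binary rows."""
    return sum(a != b for a, b in zip(row_a, row_b)) / len(row_a)


def distance_matrix(matrix):
    """Pairwise distances between all genomes of a presence/absence matrix."""
    names = list(matrix)
    return {
        (a, b): hamming_distance(matrix[a], matrix[b])
        for a in names
        for b in names
    }


# Toy input (hypothetical): three genomes, four orthologous groups.
toy_genomes = ["A", "B", "C"]
toy_groups = {
    "g1": {"A", "B", "C"},
    "g2": {"A", "B"},
    "g3": {"C"},
    "g4": {"A"},
}
toy_matrix = presence_absence_matrix(toy_groups, toy_genomes)
toy_distances = distance_matrix(toy_matrix)
```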